<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>



  
  
  <meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">


  


  
  
  
<!-- Terminology: owned and borrowed, vs owned and non-owned? -->
<!-- NEED NAME FOR TECHNIQUE, NOT JUST LANGUAGE: -->
<!--    "BORROWER COUNTING"? -->
<!--    Some bank lending analogy??-->

  <title>Ownership You Can Count On</title>
  <meta content="Adam Dingle" name="author">



  
  
  <style type="text/css">
h1 { font-size: 18pt;
font-weight: bold;
font-family: Arial,Helvetica,sans-serif;
text-align: center;
}
.author { font-family: Arial,Helvetica,sans-serif;
font-size: 12pt;
text-align: center;
}
address { text-align: center;
font-family: Arial,Helvetica,sans-serif;
font-size: 10pt;
font-style: normal;
}
body {
font-size: 9pt;
}
h2 { font-family: Times New Roman,Times,serif;
font-size: 12pt;
font-weight: bold;
}
h3 { font-family: Times New Roman,Times,serif;
font-size: 12pt;
font-weight: bold;
}
h4 { font-family: Times New Roman,Times,serif;
font-size: 11pt;
font-style: italic;
font-weight: normal;
}
  </style>
</head>


<body>



<h1>Ownership You Can Count On:<br>
A Hybrid Approach to Safe Explicit Memory Management
</h1>

<div class="author">Authors Omitted for Double-blind Review<br></div>


 
<h2>ABSTRACT</h2>



<!-- What is the problem? -->
While many forms of memory management have been proposed, the only ones to
achieve wide-spread adoption have been explicit de-allocation and garbage
collection.  

<!-- Why is it important? -->
This leaves programmers requiring explicit control of memory behavior unable to
obtain the software engineering and security benefits of programming in a safe
language.

<!-- What did we do? -->
We present a new approach to memory management [CALLED XXX] which combines type-based object
ownership and run-time reference counting of references by non-owning pointers.  When an
owning pointer is destroyed, if its (non-owning) reference count is zero, the
object is destroyed; otherwise a run-time error occurs.
We have implemented [XXX] in a safe C#-like language called Gel, which includes
powerful inter-procedural reference counting optimizations that take advantage
of ownership semantics.

<!-- How well did it work? -->

Our anecdotal experience in writing the Gel compiler and some smaller
benchmarks in Gel is that the annotation requirements and extra programming
effort are quite small, and the type system catches most ownership errors at
compile-time.  Quantitatively, we compare microbenchmarks written in Gel, Java,
C#, and C++, and versions of the Gel compiler written in both Gel and C#.  Gel
programs require at most [35%] more memory than the corresponding C++ programs,
while C# and Java require at least twice as much memory and often considerably
more.  Run-time performance varies considerably across all languages, with no
clear winner, but Gel outperforms either C++ or Java/C# in all but one
benchmark.  Our optimization eliminate a mean of [89%] of all reference counting
operations, speeding Gel programs by 20-40%.

<!-- End Abstract (but it's too long... trim) -->



<h3>INTRODUCTION</h3>



Today, most system programming is done in languages which are <dfn>unsafe</dfn>:
program errors may lead to arbitrary or undefined behavior.&nbsp; C
and C++ are unsafe languages commonly used for system
programming.&nbsp; Unsafe languages have certain fundamental
disadvantages over <dfn>safe</dfn> languages 
such as Java, C# and ML.&nbsp; Certain classes of bugs such
as buffer overruns or reads from deallocated memory can occur only in
unsafe languages.&nbsp; Many such bugs are subtle or
non-deterministic, and hence expensive in software
engineering.&nbsp; In addition, trusted and untrusted code may
safely share a single address space only in a safe language.
<p>There would thus be many advantages to using safe languages
for system programming.&nbsp; But there are challenges involved in
making safe languages as efficient as unsafe languages in their
run-time resource usage.&nbsp; In particular, any safe language
must have some sort of automatic memory management mechanism: explicit
memory allocation and deallocation as found in C and C++ is inherently
unsafe, as there is nothing to prevent a program from writing to
deallocated memory.&nbsp; The most widely used automatic memory
management mechanisms include classical reference counting and garbage
collection [Jones, 1996].&nbsp; Reference counting may incur
substantial run-time overhead and cannot easily free cyclic data
structures.&nbsp; Garbage collected-systems often use substantially
more memory than their manually managed counterparts.&nbsp; An
additional disadvantage of garbage collection is that it frees memory
non-deterministically.&nbsp; Deterministic destruction
is&nbsp;useful in system programming, especially in languages 
which allow arbitrary user code to run when objects are destroyed; such
code may be used to release system resources when no longer needed.</p>



<p>In this paper we propose an automatic memory management
mechanism based on ownership and run-time reference
counts.&nbsp; The mechanism can be used to implement a safe
language which frees objects deterministically.&nbsp; To show that
the ownership mechanism is practical, we've constructed a simple
object-oriented language called Gel which is essentially a subset of C# 
[Hejlsberg, 2004] 
extended with ownership-based memory management.&nbsp; We've
implemented an interpreter and compiler for Gel, written in the Gel
language itself.&nbsp;&nbsp; We've ported a number of common
benchmark programs to Gel and have found that the Gel implementations
often perform similarly to their C++ counterparts and use significantly
less memory than their Java and C# equivalents.</p>


<p>Unlike languages with simple manual memory management schemes
(such as malloc/free or new/delete), Gel is <dfn>safe</dfn>:
a bug in a Gel program can never corrupt memory arbitrarily.&nbsp;
Unlike garbage-collected languages, memory management in Gel is <i>deterministic</i>:
objects are freed as soon as they are no longer needed.
&nbsp;Unlike traditional reference-counting schemes, Gel can free
data structures including <dfn>cycles</dfn> of pointers as
described below.</p>

<p>In this paper, we&nbsp;describe only Gel's ownership
extensions to the
C# language. &nbsp;A reader familiar with either C# or Java should
be able to follow with no trouble; differences between the
core&nbsp;C# and Java languages are relatively minor and are not
relevant to the discussion here. For a complete description of Gel, see
the&nbsp;language reference [Gel].
&nbsp;The Gel project site (http://code.google.com/p/gel2/source)
contains the Gel interpreter/compiler, released as open source under
the MIT License.  Our decision to extend
C# rather than, say, Java was arbitrary, and it would be
straightforward to construct a similar language extending Java or
another object-oriented language.&nbsp; The Gel language is large
enough to write useful programs, such as the Gel interpreter/compiler
itself.</p>

<p>Our proposed memory management mechanism has the following
fundamental characteristics:</p>



<ul>



  <li>Objects are allocated individually using a heap
allocator.&nbsp; Deallocation is automatic; there is no <code>delete</code>
statement.</li>



  <li>The type system distinguishes <dfn>owning</dfn>
and <dfn>non-owning</dfn> pointers to objects.&nbsp;
This is a compile-time distinction only: at run time, all pointers are
represented as ordinary memory addresses.</li>



  <li>The type system ensures that at run time there is always
exactly one owning pointer to every object.</li>



  <li>Ownership of an object may <dfn>transfer </dfn>as
a program runs.&nbsp; Any owning pointer which loses ownership of
an object becomes either null (if it is a field) or inaccessible (if it is a
local variable or parameter).</li>



  <li>At run time, the language implementation keeps a reference
count of the number of non-owning pointers to each object.</li>



  <li>When a non-null owning pointer is destroyed, the object it
points to is destroyed as well.&nbsp; If a destroyed object has a
non-zero reference count, a run-time error occurs and the program is
terminated; it is the programmer's responsibility to avoid this
condition.</li>



</ul>


<p>Our goal in this work is to show that a safe language with such a type system
and memory management mechanism can be implemented, and can be practical in the
following senses:</p>

<ul>

  <li>It is easy to write real programs so that they pass the
type checker; implementing common data structures is straightforward.</li>



  <li>It is easy to write programs so that they never yield
run-time errors due to non-zero reference counts on destroyed objects.</li>



  <li>Programs run efficiently, ideally consuming roughly as much
memory and CPU time as their manually managed counterparts.</li>



</ul>



<p>The contributions of this work are:</p>

<ul>

  <li>A unique approach to safe memory management combining static type-based ownership
  with run-time reference counts;</li>

  <li>An implementation of the approach in a dialect of C# we call Gel;</li>

  <li>Implementation of the Gel compiler and a number of benchmarks in
  Gel;</li>

  <li>Intra- and inter-procedural optimizations that take advantage of
  ownership properties and make the reference counting cost acceptable by
  eliminating [89%] of all counting operations;</li> 

  <li>An experimental evaluation comparing the performance of one large Gel
  program (the Gel compiler) to the same program written in C#, and of several
  microbenchmarks written in Gel, C++, Java, and C#.</li>

</ul>



Currently, the major limitation of Gel is that its reference counting
optimizations can not easily be applied in the presence of aliasing arising
under multi-threading.  As a result, Gel currently only supports
single-threaded programs.


<br>
</body>
</html>
